Sarcasm Detection in Chinese Using a Crowdsourced Corpus

نویسندگان

Shih-Kai Lin

Shu-Kai Hsieh

چکیده

Based on the assumption that comment with positive sentimental polarity to a negative issue has high probability to be a sarcasm, we propose a simple yet efficient method to collect sarcastic textual data by crowdsourcing with social media and merging game with a purpose approach. Taking advantage of Facebook's reaction button, posts triggering strong negative emotion are collected. Next, by using PTT's search engine, we successfully connect PTT's comments to the collected posts in Facebook and build the sarcasm corpus. Based on the corpus data, the performance comparison of sarcasm detection between SVM with naïve features and Convolutional Neural Network models is conducted. An impressive accuracy rate and great potentials of the corpus are demonstrated.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Irony and Sarcasm: Corpus Generation and Analysis Using Crowdsourcing

The ability to reliably identify sarcasm and irony in text can improve the performance of many Natural Language Processing (NLP) systems including summarization, sentiment analysis, etc. The existing sarcasm detection systems have focused on identifying sarcasm on a sentence level or for a specific phrase. However, often it is impossible to identify a sentence containing sarcasm without knowing...

متن کامل

A Large Self-Annotated Corpus for Sarcasm

We introduce the Self-Annotated Reddit Corpus (SARC), a large corpus for sarcasm research and for training and evaluating systems for sarcasm detection. The corpus has 1.3 million sarcastic statements — 10 times more than any previous dataset — and many times more instances of non-sarcastic statements, allowing for learning in both balanced and unbalanced label regimes. Each statement is furthe...

متن کامل

"sure, I Did the Right Thing": a System for Sarcasm Detection in Speech

While a fair amount of work has been done on automatically detecting emotion in human speech, there has been little research on sarcasm detection. Although sarcastic speech acts are inherently subjective, humans have relatively clear intuitions as to what constitutes sarcastic speech. In this paper, we present a system for automatic sarcasm detection. Using a new acted speech corpus that is ann...

متن کامل

Putting Sarcasm Detection into Context: The Effects of Class Imbalance and Manual Labelling on Supervised Machine Classification of Twitter Conversations

Sarcasm can radically alter or invert a phrase’s meaning. Sarcasm detection can therefore help improve natural language processing (NLP) tasks. The majority of prior research has modeled sarcasm detection as classification, with two important limitations: 1. Balanced datasets, when sarcasm is actually rather rare. 2. Using Twitter users’ self-declarations in the form of hashtags to label data, ...

متن کامل

Sarcasm Detection on Czech and English Twitter

This paper presents a machine learning approach to sarcasm detection on Twitter in two languages – English and Czech. Although there has been some research in sarcasm detection in languages other than English (e.g., Dutch, Italian, and Brazilian Portuguese), our work is the first attempt at sarcasm detection in the Czech language. We created a large Czech Twitter corpus consisting of 7,000 manu...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2016

Sarcasm Detection in Chinese Using a Crowdsourced Corpus

نویسندگان

چکیده

منابع مشابه

Irony and Sarcasm: Corpus Generation and Analysis Using Crowdsourcing

A Large Self-Annotated Corpus for Sarcasm

"sure, I Did the Right Thing": a System for Sarcasm Detection in Speech

Putting Sarcasm Detection into Context: The Effects of Class Imbalance and Manual Labelling on Supervised Machine Classification of Twitter Conversations

Sarcasm Detection on Czech and English Twitter

عنوان ژورنال:

اشتراک گذاری